Listen to Your Face: Inferring Facial Action Units from Audio Channel

Authors

Zibo Meng

Shizhong Han

Yan Tong

Abstract

Extensive efforts have been devoted to recognizing facial action units (AUs). However, it is still challenging to recognize AUs from spontaneous facial displays, especially when they are accompanied by speech. Unlike prior work, which relied on visual observations for facial AU recognition, this paper presents a novel approach that recognizes speech-related AUs exclusively from audio signals, based on the fact that facial activities are highly correlated with voice during speech. Specifically, dynamic and physiological relationships between AUs and phonemes are modeled through a continuous-time Bayesian network (CTBN); AU recognition is then performed by probabilistic inference via the CTBN model. A pilot audiovisual AU-coded database has been constructed to evaluate the proposed audio-based AU recognition framework. The database consists of a “clean” subset with frontal and neutral faces and a challenging subset collected with large head movements and occlusions. Experimental results on this database show that the proposed CTBN model achieves promising recognition performance for 7 speech-related AUs and outperforms the state-of-the-art visual-based methods, especially for those AUs that are activated at low intensities or are “hardly visible” in the visual channel. Furthermore, the CTBN model's advantage is larger on the challenging subset, where the visual-based approaches suffer significantly.
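To make the CTBN idea in the abstract more concrete, the following is a minimal sketch, not the authors' model. It assumes a single binary AU node whose only parent is the current phoneme, and that the phoneme sequence with durations is fully observed. Under those assumptions the AU follows a two-state continuous-time Markov process whose switching rates are selected by the phoneme, and its activation probability has a closed form. The phoneme labels, the rate values in `RATES`, and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical conditional intensities (events per second) for one binary AU
# node conditioned on the current phoneme. The numbers are invented for
# illustration and are not taken from the paper.
# Each entry is (rate of switching off -> on, rate of switching on -> off).
RATES = {
    "b": (1.0, 20.0),   # assumed: the AU tends to switch off quickly
    "a": (15.0, 0.5),   # assumed: the AU tends to switch on quickly
}


def propagate(p_on, rate_on, rate_off, dt):
    """Advance P(AU active) by dt seconds for a two-state Markov process.

    Solves dp/dt = rate_on * (1 - p) - rate_off * p in closed form.
    """
    total = rate_on + rate_off
    stationary = rate_on / total
    return stationary + (p_on - stationary) * math.exp(-total * dt)


def au_probability(segments, p_on=0.0):
    """Return P(AU active) at the end of each observed phoneme segment.

    segments: list of (phoneme, duration_in_seconds) pairs.
    p_on:     probability that the AU is active before the first segment.
    """
    trace = []
    for phoneme, duration in segments:
        rate_on, rate_off = RATES[phoneme]
        p_on = propagate(p_on, rate_on, rate_off, duration)
        trace.append((phoneme, p_on))
    return trace


if __name__ == "__main__":
    for phoneme, p in au_probability([("a", 0.2), ("b", 0.1)]):
        print(f"after /{phoneme}/: P(AU active) = {p:.3f}")
```

A full CTBN would learn the intensity matrices from data, include several interacting AU nodes, and infer over uncertain phoneme evidence; this sketch only shows how a phoneme-dependent intensity drives an AU's probability over continuous time.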

Related resources

Listen with your skin: aerotak speech perception enhancement system

Here we introduce Aerotak: A system for audio analysis and perception enhancement that allows speech perceivers to listen with their skin. The current system extracts unvoiced portions of an audio signal representative of turbulent air-flow in speech. It stores the audio signal in the left channel of a stereo audio output, and the air flow signal is stored in the right channel. The stored audio...

Analysis and Synthesis of Facial Expressions by Feature-Points Tracking and Deformable Model

Face expression recognition is useful for designing new interactive devices offering the possibility of new ways for human to interact with computer systems. In this paper we develop a facial expressions analysis and synthesis system. The analysis part of the system is based on the facial features extracted from facial feature points (FFP) in frontal image sequences. Selected facial feature poi...

Face reading from speech - predicting facial action units from audio cues

The automatic recognition of facial behaviours is usually achieved through the detection of particular FACS Action Unit (AU), which then makes it possible to analyse the affective behaviours expressed in the face. Despite the fact that advanced techniques have been proposed to extract relevant facial descriptors, the processing of real-life data, i. e., recorded in unconstrained environments, m...

Audiovisual Facial Action Unit Recognition using Feature Level Fusion

Recognizing facial actions is challenging, especially when they are accompanied with speech. Instead of employing information solely from the visual channel, this work aims to exploit information from both visual and audio channels in recognizing speech-related facial action units (AUs). In this work, two feature-level fusion methods are proposed. The first method is based on a kind of human-cr...

Recognizing Upper Face Action Units for Facial Expression Analysis

We develop an automatic system to analyze subtle changes in upper face expressions based on both permanent facial features (brows, eyes, mouth) and transient facial features (deepening of facial furrows) in a nearly frontal image sequence. Our system recognizes fine-grained changes in facial expression based on Facial Action Coding System (FACS) action units (AUs). Multi-state facial component ...

Journal: CoRR

Volume: abs/1706.07536

Published: 2017
